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We introduce the Gaussian process (GP) modeling module 
developed within the UQLab software framework. The novel 
design of the GP-module aims at providing seamless 
integration of GP modeling into any — uncertainty 
quantification workflow, as well as a_ standalone surrogate 
modeling tool. We first briefly present the key mathematical 
tools on the basis of GP modeling (a.k.a. Kriging), as well as 
the associated theoretical and computational framework. We 
then provide an extensive overview of the available features 
of the software and demonstrate its flexibility and user- 
friendliness. Finally, we showcase the usage and_ the 
performance of the software on _ several applications 
borrowed from different fields of engineering. These include 
a basic surrogate of a _ well-known analytical benchmark 
function; a hierarchical Kriging example applied to wind 
turbine aero-servo-elastic simulations and a more complex 
geotechnical example that requires a non-stationary, user- 
defined correlation function. The GP-module, like the rest of 
the scientific code that is shipped with UQlLab, is open 
source (BSD license). 


1. Introduction 


Uncertainty quantification (UQ) through computer simulation is an interdisciplinary field that 
has seen rapid growth in the last decades. It aims at i) identifying and quantifying the uncertainty 
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in the input parameters of numerical models of physical systems, and 11) quantitatively assessing 
its effect on the model responses. Such a general formulation comprises some applications, 
including structural reliability [1], sensitivity analysis [2], reliability-based design optimization 
[3] and Bayesian techniques for calibration and validation of computer models [4]. 


Due to the high cost of repeatedly evaluating complex computational models, analyses with 
classical sampling techniques such as Monte Carlo simulation are often intractable. In this 
context, meta-modeling techniques (also known as surrogate modeling) allow one to develop 
fast-to-evaluate surrogate models from a limited collection of runs of the original computational 
model, referred to as the experimental design [5—7]. Popular surrogate modeling techniques 
include Kriging [8], polynomial chaos expansions [9,10] and support vector regression [11]. 


Kriging is a surrogate modeling technique first conceived by Krige [12] in the field of 
geostatistics and later introduced for the design and analysis of computer experiments by Sacks 
et al. [8] and Welch et al. [13]. The potential applications of Kriging in the context of civil and 
mechanical engineering, range from basic uncertainty propagation to reliability and sensitivity 
analysis [14-18]. Beyond approximating the output of a computational model, Kriging 
surrogates also provide local estimates of their accuracy (via the variance of the Kriging 
predictor). This enables adaptive schemes, e.g. in the context of reliability analysis [19,20] or 
surrogate model-based design optimization [21,22]. The local error estimates of a Kriging 
surrogate have also led to improved Bayesian calibration of computer models (see, e.g. Bachoc 
et al. [23]). 


Although in its standard form Kriging is a stochastic interpolation method, certain extensions 
have been proposed for dealing with noisy observations. Such extensions have been of particular 
interest to the machine learning community, and they are commonly referred to as Gaussian 
process regression [24]. 


Some dedicated toolboxes are readily available for calculating Kriging surrogate models. Of 
interest to this review is general purpose software not targeted to specific Kriging applications, 
because they are typically limited to two or three-dimensional problems (see, e.g. gslib [25]). 
Within the R community, one of the most comprehensive and well-established Kriging packages 
is arguably DiceKriging, developed by the DICE consortium [26]. This set of packages provides 
Kriging meta-modeling as part of a framework for adaptive experimental designs and Kriging- 
based optimization based on the packages DiceDesign and DiceOptim [27,28]. scikit-learn 
provides a Python-based, machine-learning-oriented implementation of Gaussian processes for 
regression and classification [29]. Alternatively, PyKriging [30] offers a Kriging toolbox in 
python that offers basic functionality with a focus on user-friendliness. Gpy [31] offers a 
Gaussian process framework with a focus on regression and classification problems. Within the 
Matlab programming language, the first Kriging toolbox with widespread use was DACE [32]. 
DACE was later extended to ooDACE [33], an object-oriented Kriging implementation with a 
richer feature set. Small Toolbox for Kriging [34] offers an alternative Kriging implementation 
that is mainly focused on providing a set of functions for Kriging surrogate modeling and design 
of experiments. GPML [35] offers a library of functions that are directed towards Gaussian 
processes for regression and classification in a machine learning context. Finally, recent versions 
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of Matlab (starting from R2015b) provide a rapidly growing Gaussian process library for 
regression and classification. 


Due to the variety of potential applications of Kriging, different toolboxes tend to be focused on 
a specific user niche. There is limited availability of general purpose Kriging toolboxes that 
allow for seamless integration within various UQ workflows ranging from, e.g. basic uncertainty 
propagation to reliability analysis and surrogate-model-based optimization. To this end, the 
Kriging toolbox presented here was developed as a module of the general purpose UQ 
framework, UQLab ([36], www.uqlab.com). Also, although most of the toolboxes above offer a 
significant set of configuration options, the support for fully customizable Kriging is often 
limited or not easily accessible, which can be a drawback in a research environment. Finally, the 
user experience may vary from user-friendly to complex (especially to access the most advanced 
features), often requiring a significant degree of programming knowledge. This might be rather 
inconvenient for applied scientists and practitioners with limited programming knowledge. 
Following these premises, this paper introduces the UQLab Gaussian process modeling tool (GP- 
module) focusing on its unique embedding into a complex uncertainty quantification 
environment, its user-friendliness and customisability. 


The paper is structured as follows: in Section 2 a theoretical introduction to Kriging is given to 
highlight its main building blocks. In Section 3 the key-features of the GP-module are presented. 
Finally, a set of application examples is used to showcase in detail the usage of the software in 
Section 4, followed by a summary and a roadmap of the upcoming developments in Section 5. 


2. Kriging theory 


2.1. Kriging basics 


Any metamodeling approach, such as Kriging, aims at approximating the response of a 
computational model given a finite set of observations. In this context, consider a system whose 
behavior is represented by a computational model.M which maps the M-dimensional input 


parameter space D_to the 1-dimensional output space, ie, M:xeD. CR" te yeR where 
T 
ele ee 


Kriging is a meta-modeling technique which assumes that the true model response is a 
realization of a Gaussian process described by the following equation [5]: 


M* (x) =B f(x) +o0°Z(x,0) (1) 


Where B'f(x) is the mean value of the Gaussian process, also called a trend, o* is the Gaussian 
process variance and Z(x,q@)is a zero-mean, unit-variance Gaussian process. This process is 
fully characterized by the auto-correlation function between two sample points R(x,x’;0). The 
hyperparameters 0 associated with the correlation function R(-;@) are typically unknown and 
need to be estimated from the available observations. 
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Having specified the trend and the correlation function parameters it is possible to obtain an 
arbitrary number of realizations of the so-called prior Gaussian process (see Figure | left). In the 


context of metamodelling, the goal is to calculate a prediction M"(x) for a new point x, given 


an experimental design ¥ Sek of size N and the corresponding (noise-free) model 


responses y =fy” = Mx"), y= Mx)", A Kriging metamodel (a.k.a. Kriging predictor) 
provides such predictions based on the properties of the so-called posterior Gaussian process 
conditioned on the available data (see Figure 1 right). The Kriging prediction x corresponds to a 
random variate Y(x) ~ VV (4; (x),0, (x)) . Therefore the approximation of the computational 
model that is obtained is essentially an infinite family of such models. Each of these models is a 
realization (or sample) of the posterior Gaussian process. In practice, the mean response is used 
(see Eq. (6)) as the Kriging surrogate, while its variance (see Eq. (7)) is often interpreted as a 
measure of the local error of the prediction. The equations for calculating the mean and variance 
of a universal Kriging predictor are given next. 


20 20 


M(a 


0 5 10 15 0 5 10 15 


x x 
Fig. 1. Realisations of a prior (left) and posterior Gaussian process (right). The Gaussian process mean in 
each case is denoted by a black line. 


The Gaussian assumption states that the vector formed by the true model responses, y and the 


prediction, Y(x), has a joint Gaussian distribution defined by: 


{}. Ne ae 1 ek (2) 
y FB r(x) R 


where F is the information matrix of generic terms: 
F,=7,@"),i=1,...N,faL..oP, (3) 


r(x) is the vector of cross-correlations between the prediction pointxand each one of the 
observations whose terms read: 


r(x) =R(x, x”; 6), i=1,..., NV. (4) 
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R is the correlation matrix given by: 


R, =R(K”,x'";8), ij=l,...,N- (5) 


The mean and variance of the Gaussian random variate Y(x) (a.k.a. mean and variance of the 
Kriging predictor) can be calculated based on the best linear unbiased predictor properties [5]: 


Ll, (x) = £ (x) B+4r(x)'R'(y-FB), (6) 
O.(x)=0° (1 —r(x)R'r(x) +u(x)’ (FR"F) | u(x)) (7) 
where: 

B=(F'R'F) F'R'y (8) 


is the generalized least-squares estimate of the underlying regression problem and 
u(x) =F'R ‘r(x)-f (x). (9) 


Once “,(x) and oO, (x) are available, confidence bounds on predictions can be derived by 


observing that: 


Piya <r] -o( SH) (10) 


O; (x) 


Table 1 
Formulas of the most commonly used Kriging trends. 


Trend Formula 


constant (ordinary Kriging) ee 
Linear B+ by BX, 


M M OM 
quadratic By + 2 Bx, + a 2 BX; 


Where ®(-) denotes the Gaussian cumulative distribution function. Based on Eq. (10) the 
confidence intervals on the predictor can be calculated by: 


Y(x)e [wo [1-$) o.(X), U(x) +0" [1-$) a.) (11) 


moreover, can be interpreted as the interval within which the Kriging prediction falls with 
probability 1-a. 
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The equations that were derived for the best linear unbiased Kriging predictor assumed that the 
covariance function o°R(-;0) is known. In practice, however, the family and other properties of 
the correlation function need to be selected a priori. The hyperparameters 6, the regression 


coefficients, and the variance o’ need to be estimated based on the available experimental 
design. This involves solving an optimization problem that is further discussed in Section 2.4. 
The resulting best linear unbiased predictors are called empirical in Santner et al. [5] because 
they typically result from the empirical choice of various Kriging parameters that are further 
discussed in Sections 2.2 - 2.4. 


2.2. Trend 


The trend refers to the mean of the Gaussian process, i.e., the B'f(x) term in Eq. (1). Using a 


non-zero trend is optional, but it is often preferred in practice (see, e.g. Rasmussen and Williams 
[24], Sch6ébi et al. [37]). Note that the mean of the Kriging predictor in Eq. (6) is not confined to 
be zero when the trend is zero. 


In the literature, it is customary to distinguish between Kriging metamodels depending on the 
type of trend they use [5,24,38]. The most general and flexible formulation is universal Kriging, 
which assumes that the trend is composed of a sum of P arbitrary functions f,(x), ie. 


B'f(x) =) Bf, (x). (12) 


Some of the most commonly used trends for universal Kriging are given for reference in Table 1. 
Simple Kriging assumes that the trend has a known constant value, i.e. P=1, f(x) and £, is 


known. In Ordinary Kriging, the trend has a constant but unknown value, i.e. P=1, f(x) =land 


f, is unknown. 


2.3. Correlation function 


The correlation function (also called kernel in the literature, or covariance function if it includes 
the Gaussian process variance o’) is a crucial ingredient for a Kriging metamodel since it 
contains the assumptions about the function that is being approximated. An arbitrary function of 
(x,x’) is in general not a valid correlation function. In order to be admissible, it has to be chosen 
in the set of positive definite kernels. However, checking for positive definiteness of a kernel can 
be a challenging task. Therefore it is usually the case in practice to select families of kernels 
known to be positive definite and to estimate their parameters based on the available 
experimental design and model responses (see Section 2.4). A usual assumption is to consider 
kernels that depend only on the quantity h=||x-x’ 


| which are called stationary. A list of 


stationary kernels commonly used in the literature can be found in Table 2. Different correlation 
families result in different levels of smoothness for the associated Gaussian processes, as 
depicted in Figure 2 [24]. 
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In case of multidimensional inputs (M > 1), it is common practice to obtain admissible kernels as 
functions of one-dimensional correlation families like the ones in Table 2. Two standard 
approaches in the literature are the separable correlation type [8]: 


M 
R(x, x’; 8) = [ [2a pase.) 


i=l 


(13) 


Table 2 
List of available correlation families. 
Name Formula 
: h 
Linear R(h; 0) = max [o.-4) 
Al 
Exponential R(h; 0) = exp aa 
: 31h 31h 
Matérn 3/2 R(h;0) =| 1+ VBI) exp _N3IAl 
0 0 
Matérn 5/2 


Gaussian (squared exponential) 


rei) =[1 sell 2k Joo et 


30° 


R(h; 0) = on (4) 


0 5 


"Linear 

---"Matern 3/2 
Matern 5/2 

—— Gaussian 


10 15 


x 
Fig. 2. Realisations of Gaussian processes, characterized by various correlation families and the same 
length-scale (0) value. 


and the ellipsoidal type [24]: 


M =a 


i=l 


(14) 
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Although typically 6<¢R” this is not necessarily true in the general case, since the number of 
components of 0 that corresponds to each input dimension may vary. In the current stage, it is 
assumed however that one element of 8 is used per dimension for notational clarity. 


In certain scenarios (e.g., based on prior knowledge), isotropic correlation functions can be used 
for multidimensional inputs. In that case, the same correlation function parameters 8 are used for 
each input dimension in Eq. (13), and Eq. (14). 

2.4. Estimating the hyperparameters 


In most practical applications of Kriging surrogate modeling, the hyperparameters 9 are 
estimated given an experimental design ¥Y and model responses y. Maximum likelihood and 
cross-validation are the most commonly used methods for doing so and further discussed next. 


The maximum likelihood approach aims at finding the set of parameters B, 8, o° such that the 


likelihood of the observations y {M(x”),.... Mx)! is maximal. Since y follows a 


multivariate Gaussian distribution, the likelihood function reads: 


I Paae ] 
s(y-FB) R (y—-FB) |. (15) 
20 | 


r d Ry)” 
L(y |B,07,0) = _ p|- 


(210°)*”” 


For any given value of 8, the maximization of the likelihood wrt. B, and o° is a convex 
quadratic programming problem. Consequently, it admits closed form generalized least-squares 
estimates of B and o° (for proof and more details see, e.g. Santner et al. [5]): 


B=p0)=(F'R'F) F'R'y, (16) 


o=0°@)=—(y FB) R'(y-Ff). (17) 
The value of the hyperparameters 9 is calculated by solving the optimization problem: 

6 = arg min (—log L(y |B,o*,6)) (18) 
Based on Eqs (15) - (17) the optimization problem in Eq. (18) can be written as follows: 

0 = argmin{ Log (det(R)) + tg(220°) +%) (19) 


The cross-validation method (also known as K-fold cross-validation) is based instead on 


def 
partitioning the whole set of observations S={,y} into K mutually exclusive and collectively 
exhaustive subsets {S,, k=1,...,K} such that 
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S08, =@, V(i,j) €{L...,.K} and US, =S (20) 
k=l 


The k-th set of cross-validated predictions is obtained by calculating the Kriging predictor using 
all the subsets but the k-th one and evaluating its predictions on that specific k-th fold that was 
left apart. The leave-one-out cross-validation procedure corresponds to the special case that the 
number of classes is equal to the number of observations (K = N). 


In the latter case the objective function is [5,39]: 


0 =arg min ((x"")~ a es (x)) (21) 


Where yu, ., (x) is the mean Kriging predictor that was calculated using S\{x° yh 


evaluated at a point x“. Notice that for the case of leave-one-out cross-validation, i is an index, 
but in the general case, i is a vector of indices. Calculating the objective function in Eq. (21) 
requires the calculation of K Kriging surrogates. The computational requirements for performing 


this operation can be significantly reduced as shown in Dubrule [40]. 


The estimate o° is calculated using the following equation [39,41]: 


1a (Me)-a.(2)) 2) 


whereo”, _, (x‘”) is the variance of a Kriging predictor that was calculated using S \{x" sy} ; 


evaluated at a point x’. When i is a set of indices, the division and the squared operations in Eq. 
(22) are performed element-wise. 


Numerically solving the optimization problems described in Eq. (19) (maximum likelihood case) 
alternatively, Eq. (21) (cross-validation case) relies on either local (e.g., gradient-based) or global 
(e.g., evolutionary) algorithms. On the one hand, local methods tend to converge faster and 
require fewer objective function evaluations than their global counterparts. On the other hand, 
the existence of flat regions and multiple local minima, especially for larger input dimension, can 
lead gradient methods to poor performance when compared to global methods. It is common 
practice to combine both strategies sequentially to improve global optimization results with a 
final local search (which is also known as hybrid methods). 


It can often be the case in engineering applications that different components of the input 
variable x take values that differ by orders of magnitude. In such cases, potential numerical 
instabilities can be avoided by scaling YH U, e.g., as follows: 
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,i=l,...,.N, j=L....M (23) 


where u'” (resp. x;”) refer to the i-th sample of the j-th component U (resp. of 4) and | Y, | 


Var [ 2, | refer to the empirical mean and variance of the j-th component of 7 . 


3. The UQLab Gaussian process modeling module 


3.1. The UQLab project 


UQLab is a software framework developed by the Chair of Risk, Safety and Uncertainty 
Quantification at ETH Ziirich [36]. The goal of this project is to provide an uncertainty 
quantification tool that is accessible also to a non-highly-IT trained scientific audience. Due to 
the broadness of the UQ scope, a correspondingly general theoretical framework is required. The 
theoretical backbone of the UQLab software lies in the global uncertainty framework developed 
by Sudret [42], De Rocquigny et al. [43], sketched in Figure 3a. According to this framework, 
the solution of any UQ problem can generally be decomposed into the following steps: 


Step A Define the physical model and the quantities of interest for the analysis. It is 
a deterministic representation of an arbitrarily complex physical model, 
e.g., a finite element model in civil and mechanical engineering. In this 
category also lie metamodels, such as Kriging, since once they are 
calculated, they can be used as surrogates of the underlying “true” model. 


Step B Identify and quantify the sources of uncertainty in the parameters of the 
system that serves as input for Step A. They are represented by a set of 
random variables and their joint probability density function (PDF). 


Step C Propagate the uncertainties identified in Step B through the computational 
model in Step A to characterize the uncertainty in the model response. This 
type of analyses includes moments analysis, full PDF characterization, rare 
events estimation, sensitivity analysis, etc. 


Step C’ Optionally, exploit the by-products of the analysis in Step C to update the 
sources of uncertainty, e.g., by performing model reduction based on 
sensitivity analysis. 
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STEP B: Input 


Random variables, 
random fields 


>| Computational 


STEP A: Model 


STEP C: Analysis 


Moments estima- 
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model, surrogate 
model 


tion, reliability 
analysis, etc. 


STEP C' 


be 
Sensitivity analysis 
etc. 


a The theoretical UQ framework based on which any UQ problem can be described. 


UQLab core 


ANALYSIS 


-—— Model 1 Analysis 1 


t—— Model 2 Analysis 2 
-———— Model M Analysis J 


b The modular structure of the UQLab framework. An arbitrary number of objects (Input, Model, 
Analysis) can be connected at any stage of the UQ problem. 


Fig. 3. An abstract illustration of the UQLab architecture (b) based on the theoretical UQ framework in 
(a) by Sudret [42]. 


These components introduce a clear semantic distinction between the elements involved in any 
UQ problem: model, input, and analysis. This theoretical framework provides the ideal 
foundation for the development of the information flow model in a multi-purpose UQ software. 


At the core of UQLab lies a modular infrastructure that closely follows the semantics previously 
described, graphically represented in Figure 3b. The three steps identified in Figure 3a are 
directly mapped to core modules in Figure 3b: model corresponds to Step A (physical modeling, 
metamodeling), input to Step B (sources of uncertainty) and analysis to Step C (uncertainty 
analysis). Within the UQLab framework, a module refers to some particular functionality, e.g., 
the GP-module provides Kriging surrogate modeling. Each module extends the functionalities of 
one of the core modules. It can be either self-contained or capitalize on other modules for 
extended functionalities. 


The real “actors” of a UQ problem are contained in the objects connected to each of the core 
modules. A typical example of such objects would be an input object that generates samples 
distributed according to arbitrary PDFs, a model object that runs a complex FEM simulation, or 
an analysis object that performs reliability analysis. The platform allows one to define an 
arbitrary number of objects and select the desired ones at various stages of the solution of a 
complex UQ problem. 
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UQLab first became freely available to the academic community on July 2015 as a beta version. 
On April 2017 the version 1.0 of UQLab was released. Starting from version 1.0 all the scientific 
code of the software is open-source (BSD license). By May 2018 around 1300 users have 
already registered and used it. 


3.2. The GP-module 


Kriging is one of the metamodelling modules available in UQLab [44]. Following the semantics 
described in the previous section, it is attached to the model core module. Although other 
modules can use the GP-module itself, e.g., an analysis module performing reliability analysis 
combining Kriging and Monte Carlo Simulation (AK-MCS) [19,45], the focus of this work is on 
the capabilities of the GP-module itself. 


An overview of the available features of the GP-module is given in Table 3. The GP-module 
incorporates the four ingredients identified in Section 2.1: 


e Trends: Universal Kriging trends are fully supported, including simple, ordinary, or 
polynomial of arbitrary degree. Also, custom basis functions f(x) or a completely custom 
trend function may be specified 

e Correlation functions: Standard correlation families from the literature are readily available 
as well as the possibility of creating user-defined ones. For multi-dimensional inputs, 
ellipsoidal and separable correlation functions can be used, also allowed for isotropic ones. 
Fully user-specified correlation functions are also supported 

e Estimation methods: Maximum likelihood (Eq. (19)) and cross-validation (Eq. (21)) 
methods can be used for estimating the hyper-parameters 

e Optimisation methods: Matlab’s built-in local and global optimization methods are offered, 
namely BFGS and genetic algorithm as well as a genetic algorithm with BFGS refinement 
(hybrid). 


In addition, various scaling operations are allowed for avoiding numerical instabilities during the 
hyperparameters estimation. Such operations may vary from simple zero-mean scaling to more 
advanced ones such as iso-probabilistic transformations by interfacing with other UQLab 
modules. 


Following the general design principle of UQLab concerning user-friendliness, all the possible 
configuration options have default values pre-assigned to allow basic usage of the module with 
very few lines of code (see Section 4.1). A Matlab structure variable is used to specify a Kriging 
configuration, called Koptions in the following sections. 


To showcase the minimal working code for obtaining a Kriging surrogate, a simple application is 
considered. The experimental design consists of 8 random samples in the [0,15] interval, and it is 
contained in the variable xED. The “true” model is M(x) = xsin(x) , and the corresponding model 
responses are stored in the variable yep. The minimal code required for obtaining a Kriging 
surrogate, given XED and YED Is the following: 


KOptions.Type = ’Metamodel’; 
KOptions.MetaType = ’Kriging’; 
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KOptions.ExpDesign.X = XED; 
KOptions.ExpDesign.Y = YED; 
myKriging = uq createModel (KOptions) ; 


The first line clarifies the type of UQLab object that is being requested. Following the general 
UQ Framework in Figure 3a a model object of type ’Metamodel’ is created. The next line 
specifies the type of metamodel, followed by the manual specification of the experimental 
design. Finally, the UQLab command uq createModel is used in order to create a model object 
using the configuration options in KOptions. 


The resulting Kriging metamodel object mykriging contains all the required information to 
compute the mean and variance of the Kriging predictor on new test points (x). This can be done 
using the following command: 


[meanY, varY] = uq_evalModel (myKriging, X); 
where meany corresponds to the mean and vary to the variance of the Kriging predictor on the 
test points (see Eqs. (6), (7)). 


Once the metamodel is created, a report of the main properties of the Kriging surrogate model 
can be printed on screen by: 


uq_ print (myKriging) ; 


al Kriging metamodel -------------- % 
Object Name: Model 1 Input Dimension: 1 
Experimental Design 

Sampling: User 

X size: [8x1] 

Y size: [8x1] 

Trend 

Type: ordinary 

Degree: 0 


Gaussian Process 


—Kriging approximation 
15 |lmm95% confidence interval 
* Observations 


-15 
0 5 10 
xX 


Fig. 4. The output of uq_display of a Kriging model object having a one-dimensional input. 
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Table 3 
List of features of the UQLab GP-module. The default values for each property is in bold. 
Feature Specification Value Description 
Trend Simple A constant term specified by the user (simple 
Kriging) 
Ordinary A constant term estimated using Eq. (8) (ordinary 
Kriging) 
Polynomial basis The trend in Eq. (12) consists of polynomial basis 
function f;, of arbitrary degree 
Custom basis The trend in Eq. (12) consists of arbitrary functions fx 
Custom trend Custom trend function that computes F directly 
Correlation Types Separable As described in Eq. (13). Both isotropic and 
anisotropic variants are supported. 
Ellipsoidal As described in Eq. (14). Both isotropic and 
anisotropic variants are supported. 
Custom Custom correlation function that computes R_ directly 
Families Commonly used All the correlation families reported in Table 2 are 
available 
Custom A custom correlation family can be specified 
Estimation ML Maximum-likelihood estimation (see Eq. (19)) 
CV K-fold Cross-Validation method (see Eq. (21)). 
Any K value is supported 
Optimisation BFGS Gradient-based optimization method 
(BroydenFletcher-Goldfarb-Shanno algorithm). Matlab 
built-in 
GA Global optimization method (genetic algorithm). 
Matlab built-in 
HGA Genetic algorithm optimization with BFGS refinement 
Corr. Type: ellipsoidal (anisotropic) 
Corr. family: matern-5 2 
sigma’ 2: 4.787983e+01 


Estimation method: Cross-Validation 


Hyperparameters 
theta: [ 0.00100 ] 
Optim. method: Hybrid Genetic Algorithm 


4.3698313e-01 


Leave-one-out 


rror: 


Ch. Lataniotis et al./ Journal of Soft Computing in Civil Engineering 2-3 (2018) 91-116 105 


It can be observed that the default values for the trend, correlation function, estimation, and 
optimization method have been assigned (see Table 3). A visual representation of the metamodel 
can be obtained by: 


uq_ display (myKriging) ; 
Note that the uq_ display command can only be used for quickly visualizing Kriging 
surrogates when the inputs are one- or two-dimensional. The figure produced by uq_display 
is shown in Figure 4. 


4. Application examples 


4.1. Basic example 


The goal of this introductory example is to calculate a Kriging surrogate of a well-known 
surrogate modeling benchmark, the Branin-Hoo function. This function has been traditionally 
used as a benchmark for global optimization methods (see, e.g. Jones et al. [46]). A slightly 
modified version is considered this work, that was first proposed as a surrogate modeling 
benchmark by Forrester et al. [7] due to its representative shape concerning engineering 
applications. It is an analytical function given by: 


M(x) =a Hs, — bx; + cx, =r) +s(1-t)cos(x,)+s, xeR’. (24) 


Some standard values of the parameters are used, namely a = 1, b= 5.1 (477), c= 5/m, r= 6, 5= 
10 and t = 1/(87). The function is evaluated on the square x1 € [-5,10], x2 € [0,15]. 


By taking advantage of the input and model modules of UQLab, the experimental design and 
model responses that will be used for calculating the surrogate can be generated with minimal 
effort. First, the probabilistic input model and the true model are defined as follows: 


6 Start the UQLab framework 

uqlab; 

% Specify the probabilistic input model IOptions.Marginals(1).Type = 
‘Uniform’ ; 


TOptions.Marginals(1).Parameters = [-5, 10]; 
TOptions.Marginals(2).Type = ’Uniform’; IOptions.Marginals (2) .Parameters 
= [0, 15]; 


myInput = uq_createInput (IOptions) ; 
% Specify the computational model 

MOptions.mString = [’ (X(:,2) - .1/(2*pi)*2*X(:,1).*2 + 5/pi*X(:,1) 
- 6).%2'’ + 10* (1-1/(8*pi))*cos(X(:,1)) + 10']; 


myModel = uq_createModel (MOptions) ; 
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Fig. 5. From left to right: the Branin-Hoo function (true model) followed by the mean and standard 
deviation of the Kriging predictor. Red dots illustrate the experimental design. 


Note that the model object of the Branin-Hoo function can be equally coded in a Matlab m-file 
or written as a string (which is a useful feature for simple demo functions only). 


Next, the experimental design xED is generated along with the corresponding true model 
responses YED. The Latin Hypercube Sampling (LHS) method is used to obtain a space-filling 
experimental design of 15 samples [47]: 


% Draw 15 samples using Latin Hypercube Sampling 
ED = uq_getSample(15, 'LHS’); 
Calculate the corresponding model responses YED = uq_evalModel (myModel, 
ED); 
A Kriging surrogate model using the xED, YED variables can be created as follows: 


x 


fo) 


x 


KOptions.Type = ’Metamodel’; 

KOptions.MetaType = ’Kriging’; 

KOptions.ExpDesign.Sampling = ‘user’; 

KOptions.ExpDesign.X = XED; 

KOptions.ExpDesign.Y = YED; 

myKriging = uq_ createModel (KOptions) ; 
All the required ingredients for obtaining a Kriging surrogate are assigned default values unless 
specified by the user (see Section 3.2). The surrogate that is obtained can be visually inspected 


by issuing the command: 


uq_ display (myKriging) ; 
The result of the uq_display command is shown in Figure 5. The Kriging surrogate myKriging 
can be used like any other model (e.g., myMode1) to calculate its response given a new sample of 
the input x using the uq_evalModel function. For example, the mean predictor, meany, of 100 
samples generated by Monte Carlo sampling can be computed as follows: 

X = uq_getSample (100); 

meanY = uq evalModel (myKriging, X); 
More information can be extracted from the Kriging predictor using a slightly different syntax. 
The following code: 


[meanY, varY, covY] = uq_evalModel (myKriging, X); 


Ch. Lataniotis et al./ Journal of Soft Computing in Civil Engineering 2-3 (2018) 91-116 107 


allows to retrieve the 100 x 1 Kriging mean meany, the 100 x 1 Kriging variance vary and the 
100 x 100 full covariance matrix of the surrogate model responses covy. 


Low-fidelity Kriging High-fidelity Kriging Hierarchical Kriging 
(RMSE = 0.552) (RMSE = 0.545) (RMSE = 0.174) 
8000 8000 8000 
7000 4 7000 
<p 6000 6000 
“5b 5000 5000 
= 4000 4000 
£3000 3000 
2000 2000 
1000 1000 1 
2000 4000 6000 8000 2000 4000 6000 8000 2000 4000 6000 8000 
Y (true) Y (true) Y (true) 


Fig. 6. Comparison of true model output (from high fidelity simulations) versus various Kriging 
surrogates on a validation set of size 150. 


4.2. Hierarchical kriging 


To further illustrate the flexibility that can be achieved with the use of arbitrary trend functions, a 
hierarchical Kriging application is showcased. Hierarchical Kriging [48] is one Kriging 
extension aiming to fuse information from experimental designs related to different physical 
models of different fidelity. This is achieved by first calculating a Kriging surrogate using the 
low-fidelity observations and then using it as the trend of the high-fidelity surrogate. This 
approach can be extended to more fidelity levels in a similar fashion. A set of observations and 
model responses is used that originates from aero-servo-elastic simulations of a wind-turbine as 
presented in Abdallah et al. [49]. Given a set of input parameters related to the wind flow, the 
output of interest is the maximal bending moment at the blade root of a wind turbine. 


Two types of simulators are available for estimating the maximal bending moment given the 
wind conditions. A low-fidelity simulator can generate estimates of the output with minimal 
computation time at the cost of lower accuracy. On the other hand, a high-fidelity simulator can 
more accurately predict the maximal bending moment at a significantly higher computational 
cost. In this example a total of 300 low-fidelity and 15 high-fidelity simulations are available. 
First, a Kriging surrogate is computed on the low-fidelity dataset that is contained in variables 
XED_LF, YED_ LF as follows: 


fo) 


% Create the low-fidelity surrogate 
KOptions LF.Type = ’Metamodel’; 
KOptions LF.MetaType = ’Kriging’; 
KOptions LF.ExpDesign.X = XED LF; 
KOptions LF.ExpDesign.Y = YED LF; 
KOptions LF.Corr.Family = ’Matern-3 2’; 
myKriging LF = uq_ createModel (KOptions LF); 
Using the same configuration options, another Kriging surrogate is computed using the high- 


fidelity dataset (XED_HF and YED_ HF): 


6 Create the high-fidelity surrogate 
KOptions HF.Type = ’Metamodel’; 
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KOptions HF.MetaType = ’Kriging’; 

KOptions HF.ExpDesign.X = XED HF; 

KOptions HF.ExpDesign.Y = YED HF; 

KOptions HF.Corr.Family = ’Matern-3 2’; 
myKriging HF = uq_createModel (KOptions HF); 


Now a hierarchical Kriging surrogate is computed which is trained on the high-fidelity dataset 
but uses the low-fidelity Kriging surrogate (i.e., its mean predictor) as a trend: 


% Create the hierarchical Kriging surrogate 


KOptions Hier.Type = ’Metamodel’; 

KOptions Hier.MetaType = ’Kriging’; 

KOptions Hier.ExpDesign.X = XED_ HF; 

KOptions Hier.ExpDesign.Y = YED HF; 

KOptions Hier.Corr.Family = ’Matern-3 2’; 

KOptions Hier.Trend.Type = ’custom’; 

KOptions Hier.Trend.CustomF = @(x) uq_evalModel (myKriging LF, x); 


KOptions Hier.Scaling = false; 
myKriging Hier = uq createModel (KOptions Hier); 


The option KOptions Hier.Scaling refers to the scaling of the input space before computing 
the surrogate model. In case of hierarchical Kriging scaling should be disabled because the low- 
fidelity surrogate is calculated on the original data and needs to be used “as is”. 


The performance of the different surrogate models is tested on a separate validation set of 150 
high-fidelity simulations that is contained in the variables xvAL_HF and yvaL_HF. The output 
mean Kriging predictor on the validation set is calculated as follows: 


meanY LF = uq _evalModel (myKriging LF, XVAL HF); 
meanY HF = uq_ evalModel (myKriging HF, XVAL HF); 
meanY Hier = uq evalModel (myKriging Hier, XVAL HF); 


where meanY_ LF, meanY_ HF and meanY_ Hier correspond to the low-fidelity, high-fidelity and 
hierarchical Kriging predictors respectively. 


In Figure 6 a comparison of the true model output yvAL_HF versus the mean Kriging predictors is 
made. In each case the Root Mean Square Error (RMSE) is reported for quantifying the 
predictive performance of the surrogate: 

1 N 


@ w@\ 
- a — nN 25 
ee 4 2 (Y° -u;°) (25) 


where Y denotes the true model outputs (in this case YVAL_HF), “, the Kriging predictor mean (in 
this case variables meanY_LF,meany_HF and meany_ Hier for each surrogate, respectively) and N 


the number of samples in the validation set. 


In this example, by taking advantage of the low-cost, low-fidelity observations, the hierarchical 
Kriging predictor achieves a 68% decrease of the RMSE on the validation set compared to the 
Kriging model that was solely based on the high-fidelity measurements. Moreover, by inspecting 
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the mean responses of each Kriging predictor in Figure 6, it is clear that the hierarchical Kriging 
surrogate significantly reduces the prediction bias compared to the low- and high-fidelity ones 
taken as standalone. As demonstrated by this application, building a hierarchical Kriging 
surrogate model requires minimal effort thanks to the customisability of the GP-module. 


4.3. Kriging with custom correlation function 
This example illustrates how the correlation function customization capabilities of the GP- 


module can be used to apply Kriging in a non-standard setting. 


Consider the discontinuous subsurface model given in Figure 7, which may represent the 
distribution of some soil property (e.g., porosity) in the presence of a fault. The true model 
consists of two realizations of two distinct random processes on the two regions A1 and A? at the 
left and right of the fault, respectively: 


Z,(x,R(0,)), xe A, 


26 
Z,(x,R(0,)), xEA, (26) 


M(x) = 


where x={.x,,x,} represents the spatial coordinates in the 2D domain, Z; (resp. Zz) are 


realizations of a Gaussian process characterized by a correlation function with length 
scales 0, = ae (resp. 6, = Oats ty: 


A Kriging surrogate model will be calculated using the following correlation 
function: 
R(x, x61), (xx')e A, xA, 
R(x,x';8) = R(x,x'302), (x,x')e A, xA, (27) 


0 , otherwise 


Fig. 7. Graphical visualization of the subsurface model. The unknowns (length scales of each random 
field and the fault angle) are denoted by red color. 
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Where 0={0,,0,,a}. There is a smooth dependence on x,,x, within each region, but no 
correlation between points that belong to different regions. The boundary between the two 
regions is fully defined by the crack angle, a, which is unknown and the fault location that is 
assumed to be known ({x,,x,}={0.6,1}). The goal here is to use Kriging to interpolate the 
measurements taken at borehole locations A, B, and C and estimate the five unknown parameters 
6 ={6,,0,,a}. The correlation function of each region is the same, both in the true model and the 
Kriging surrogate, i.e., it is assumed to be known. In particular, the correlation function is 
separable Matérn 3/2 (see Eq. (13) and Table 2). The maximum-likelihood method is selected 
for estimating 8. Due to the complexity of the underlying optimization problem a hybrid genetic 


algorithm with relatively large population size and a maximum number of generations is 
selected. 


A Matlab implementation of the correlation function in Eq. (27) is given in Appendix A. This 
Matlab function is called my eval_R in the following code snippet. 


M . Oe 
‘ (x) , iu 4 re 

F ; 0.8 1 
: E 0.6 0.5 

; ; 0.4 0 
F ; 0.2 -0.5 

0 -1 

0 0.5 1 
a 


0 0.5 1 0 0.5 1 
pal ry 


Figure 8: From left to right: The true permeability of the soil, followed by the mean and standard 
deviation of the Kriging predictor. Red dots illustrate the experimental design. 


The Kriging surrogate is created next, based on a limited set of observations contained in the 
variables BoreholeLocations and BoreValues, which contain the locations of the 
measurements along the boreholes and the value of the desired property, respectively. 


KOptions.Type = ’Metamodel’; 
KOptions.MetaType = ’Kriging’; 
KOptions.ExpDesign.X = BoreholeLocations; 
KOptions.ExpDesign.Y = BoreValues; 
KOptions.Corr.Handle = @my_ eval R; 

% Add upper and lower bounds on the optimization variables 
BoundsL = [0.3 0.1 0.3 0.1 pi/6] ; 

BoundsuU = [0.9 0.5 0.9 0.5 5*pi/6] ; 


KOptions.Optim.Bounds =[BoundsL ;BoundsU]; 
KOptions.Optim.Method = ’HGA’; 
KOptions.Optim.HGA.nPop = 60; 
KOptions.Optim.MaxIter = 50; 
KOptions.EstimMethod = ’ML’; 
KOptions.Scaling = False; 

myKriging = uq createModel (KOptions) ; 
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Once the Kriging metamodel has been computed, the mean and standard deviation of the Kriging 
predictor can be quickly visualized for 1D and 2D models using the uq display command, 
which produces a plot similar to Figure 8, except in a smaller domain determined by the range of 
the points in the experimental design. A comparison between the true and the estimated values, 0 
is given in Table 4. As expected, the accuracy of the hyperparameters estimation is low due to the 
limited dispersion of the experimental design. The error of the length scale estimates along the x1 
direction is consistently larger due to the lack of samples along that direction. From a coding 
perspective, although the correlation function that is used is relatively complex, it is 
straightforward to use in a Kriging surrogate once coded as a Matlab function (by setting the 
KOptions.Corr.Handle value appropriately). Moreover, custom correlation functions are 
allowed to have an arbitrary number of hyperparameters. The only requirement is that the 
optimisation bounds (or initial value, depending on the optimisation method that is used) must 
have the same length as the number of the hyperparameters. 


Table 4 
Listing of the true and estimated correlation function parameters, 8, for the Kriging surrogate of the 
subsurface model. 


Parameter O11 O12 621 622 a 
True value 0.600 0.250 0.900 0.350 1309 
Estimated value 0.310 0271 0310 0374 1.342 
Relative error (%)| 483 82 656 69 2.5 


5. Summary and outlook 


In this paper, the GP-module of the UQLab software framework was presented. This UQLab 
module enables practitioners from various disciplines to get started with Kriging metamodelling 
with minimal effort as was illustrated in the introductory application in Section 4.1. However, it 
is also possible to access more advanced customization, e.g., for research purposes. This was 
showcased in Section 4.2 where a hierarchical Kriging metamodel was developed and in Section 
4.3 where a relatively complex, non-stationary correlation function was used to solve a 
geostatistical inverse problem. The GP-module is freely available to the academic community 
since the first beta release of UQLab in July 2015. 


The current version of the GP-module only allows for computing Kriging models on noisy data 
by explicitly providing the noise level via the nugget effect. The general case where the noise 
level is unknown and needs to be estimated (a.k.a. Gaussian process regression) is currently 
under development and will be addressed in an upcoming release. In addition, the current version 
of the GP-module relies on additional Matlab toolboxes for performing the hyperparameter 
optimization. This may be a limiting factor for some users. 


In addition to the modules currently exploiting its functionality (Polynomial Chaos-Kriging and 
Reliability analysis [45,50]), new UQLab modules that interface with the GP-module are 
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currently under active development. The upcoming random fields module will offer several 
random field types (conditional and unconditional) together with advanced sampling 
methodologies and will be interfaced with the GP-module to offer trajectory resampling 
capabilities. Similarly, the upcoming Reliability-Based Design Optimisation (RBDO) module 
uses the surrogate modeling capabilities of the GP-module for solving RBDO problems as 


described in Moustapha et al. [22]. 


Appendix A. Kriging with custom correlation function: implementation 


details 


This section aims to provide some additional implementation details on the application example 
in Section 4.3, in terms of the Matlab code involved. The correlation function described in Eq. 
(27) can be translated to the following Matlab function: 


function R = my eval R( 


x1,x2,theta,parameters ) 


xc = 0.6; % the x-location of the crack on the surface 
yo = 1; % the y-location of the crack on the surface 


length scales 1 theta 
length scales 2 theta 
crack _angle = theta(5) 


fo) 


angles xl = acos( (xc - 
(xl(:,2) - yo).*2 ) ); 

© find the indices of x 
idx xl 1 = angles xl <= 


fo) 


% find the indices of x 


idx xl 2 = ~idx_x1_1; 

% find the angles of each sample of x2 
angles x2 = acos( (xc - 

(x2(:,2) - yc).*2 ) )F 


idx x2 1 = angles x2 <= 


idx x2 2 = ~idx x2 1; 


fo) 


(1:2); 
(3:4); 


, 


% find the angles of each sample of xl 


x1(:,1))./sqrt((x1(:,1) - xc).%2 + 
1 that belong to first region 


crack angle; 
1 that belong to second region 


x2(:,1))./sqrt((x2(:,1) - xc).*%2 + 


% find the indices of x2 that belong to first region 


crack angle; 


%& find the indices of x2 that belong to second region 


% set-up various correlation function options so that we can re-use the 


% build-in UQLab function for evaluating R in each region 
CorrOptions.Type = ‘separable’; 
CorrOptions.Family = '’Matern-3 2’; 


CorrOptions.Isotropic = 
CorrOptions.Nugget = le 
6 initialize R matrix 


false; 


-2; 


R = zeros(size(xl1,1), size(x2,1)); 
% Compute the R values in region 1 


R(idx xl 1,idx x2 1) = 
x2(idx x2 1,:),... 
length scales 1, CorrOp 


fo) 


% Compute the R values 


uq Kriging eval R( xl(idx_xl1_1,:), 


tions); 
in region 2 


R(idx xl 2,idx x2 2) = 
x2(idx x2 2,:),... 
length scales 2, CorrOp 
end 


uq Kriging eval _R( xl(idx_ xl 2,:), 


tions); 
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The provided code, although vectorized, is optimized for readability and not performance. To 
that end, the internal function of the GP-module uq Kriging eval _R is used for calculating the 
correlation function value in each of the regions. 
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